Overview

Dataset statistics

Number of variables: 14
Number of observations: 4898
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 13
Duplicate rows (%): 0.3%
Total size in memory: 535.8 KiB
Average record size in memory: 112.0 B
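
The percentage and per-record figures in this table follow from the raw counts. A minimal sketch, using only the numbers reported above (the variable names are illustrative and not part of the report):

```python
# Figures copied from the Dataset statistics table.
n_rows = 4898
n_duplicates = 13
total_size_kib = 535.8

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = 100 * n_duplicates / n_rows

# Average record size in bytes (1 KiB = 1024 bytes).
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```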

Variable types

Numeric: 12
Categorical: 2

Warnings

Dataset has 13 (0.3%) duplicate rows (Duplicates)

Reproduction

Analysis started: 2021-03-16 14:02:38.721983
Analysis finished: 2021-03-16 14:04:57.590861
Duration: 2 minutes and 18.87 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml

Variables

fixed acidity
Real number (ℝ≥0)

Distinct: 68
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.854787668
Minimum: 3.8
Maximum: 14.2
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.4 KiB

Quantile statistics

Minimum: 3.8
5-th percentile: 5.6
Q1: 6.3
median: 6.8
Q3: 7.3
95-th percentile: 8.3
Maximum: 14.2
Range: 10.4
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.8438682277
Coefficient of variation (CV): 0.1231063993
Kurtosis: 2.172178465
Mean: 6.854787668
Median Absolute Deviation (MAD): 0.5
Skewness: 0.6477514746
Sum: 33574.75
Variance: 0.7121135857
Monotonicity: Not monotonic
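
Several of the statistics reported for each numeric variable are simple functions of the others. The sketch below checks those relationships using the fixed acidity values from this report; the variable names are illustrative, and the same relationships should hold for the other numeric variables.

```python
# Values copied from the fixed acidity statistics in this report.
n = 4898                 # number of observations
total = 33574.75         # Sum
minimum, maximum = 3.8, 14.2
q1, q3 = 6.3, 7.3
std = 0.8438682277       # Standard deviation

mean = total / n                 # Mean = Sum / number of observations
value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # Interquartile range = Q3 - Q1
variance = std ** 2              # Variance = squared standard deviation
cv = std / mean                  # Coefficient of variation = std / mean

print(round(mean, 6), round(value_range, 1), round(iqr, 1))
print(round(variance, 6), round(cv, 6))
```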
Histogram with fixed size bins (bins=10)
Value, Count, Frequency (%)
6.8, 308, 6.3%
6.6, 290, 5.9%
6.4, 280, 5.7%
6.9, 241, 4.9%
6.7, 236, 4.8%
7, 232, 4.7%
6.5, 225, 4.6%
7.2, 206, 4.2%
7.1, 200, 4.1%
7.4, 194, 4.0%
Other values (58), 2486, 50.8%

Value, Count, Frequency (%)
3.8, 1, < 0.1%
3.9, 1, < 0.1%
4.2, 2, < 0.1%
4.4, 3, 0.1%
4.5, 1, < 0.1%

Value, Count, Frequency (%)
14.2, 1, < 0.1%
11.8, 1, < 0.1%
10.7, 2, < 0.1%
10.3, 2, < 0.1%
10.2, 1, < 0.1%

volatile acidity
Real number (ℝ≥0)

Distinct: 125
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2782411188
Minimum: 0.08
Maximum: 1.1
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.4 KiB

Quantile statistics

Minimum: 0.08
5-th percentile: 0.15
Q1: 0.21
median: 0.26
Q3: 0.32
95-th percentile: 0.46
Maximum: 1.1
Range: 1.02
Interquartile range (IQR): 0.11

Descriptive statistics

Standard deviation: 0.1007945484
Coefficient of variation (CV): 0.3622561211
Kurtosis: 5.091625817
Mean: 0.2782411188
Median Absolute Deviation (MAD): 0.06
Skewness: 1.576979503
Sum: 1362.825
Variance: 0.01015954099
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value, Count, Frequency (%)
0.28, 263, 5.4%
0.24, 253, 5.2%
0.26, 240, 4.9%
0.25, 231, 4.7%
0.22, 229, 4.7%
0.27, 218, 4.5%
0.23, 216, 4.4%
0.2, 214, 4.4%
0.3, 198, 4.0%
0.21, 191, 3.9%
Other values (115), 2645, 54.0%

Value, Count, Frequency (%)
0.08, 4, 0.1%
0.085, 1, < 0.1%
0.09, 1, < 0.1%
0.1, 6, 0.1%
0.105, 6, 0.1%

Value, Count, Frequency (%)
1.1, 1, < 0.1%
1.005, 1, < 0.1%
0.965, 1, < 0.1%
0.93, 1, < 0.1%
0.91, 1, < 0.1%

citric acid
Real number (ℝ≥0)

Distinct: 87
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3341915067
Minimum: 0
Maximum: 1.66
Zeros: 19
Zeros (%): 0.4%
Memory size: 38.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.17
Q1: 0.27
median: 0.32
Q3: 0.39
95-th percentile: 0.54
Maximum: 1.66
Range: 1.66
Interquartile range (IQR): 0.12

Descriptive statistics

Standard deviation: 0.1210198042
Coefficient of variation (CV): 0.362127109
Kurtosis: 6.174900657
Mean: 0.3341915067
Median Absolute Deviation (MAD): 0.06
Skewness: 1.281920398
Sum: 1636.87
Variance: 0.01464579301
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value, Count, Frequency (%)
0.3, 307, 6.3%
0.28, 282, 5.8%
0.32, 257, 5.2%
0.34, 225, 4.6%
0.29, 223, 4.6%
0.26, 219, 4.5%
0.27, 216, 4.4%
0.49, 215, 4.4%
0.31, 200, 4.1%
0.33, 183, 3.7%
Other values (77), 2571, 52.5%

Value, Count, Frequency (%)
0, 19, 0.4%
0.01, 7, 0.1%
0.02, 6, 0.1%
0.03, 2, < 0.1%
0.04, 12, 0.2%

Value, Count, Frequency (%)
1.66, 1, < 0.1%
1.23, 1, < 0.1%
1, 5, 0.1%
0.99, 1, < 0.1%
0.91, 2, < 0.1%

residual sugar
Real number (ℝ≥0)

Distinct: 310
Distinct (%): 6.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.391414863
Minimum: 0.6
Maximum: 65.8
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.4 KiB

Quantile statistics

Minimum: 0.6
5-th percentile: 1.1
Q1: 1.7
median: 5.2
Q3: 9.9
95-th percentile: 15.7
Maximum: 65.8
Range: 65.2
Interquartile range (IQR): 8.2

Descriptive statistics

Standard deviation: 5.072057784
Coefficient of variation (CV): 0.7935735502
Kurtosis: 3.469820103
Mean: 6.391414863
Median Absolute Deviation (MAD): 3.6
Skewness: 1.077093756
Sum: 31305.15
Variance: 25.72577016
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value, Count, Frequency (%)
1.2, 187, 3.8%
1.4, 184, 3.8%
1.6, 165, 3.4%
1.3, 147, 3.0%
1.1, 146, 3.0%
1.5, 142, 2.9%
1.8, 99, 2.0%
1.7, 99, 2.0%
1, 93, 1.9%
2, 79, 1.6%
Other values (300), 3557, 72.6%

Value, Count, Frequency (%)
0.6, 2, < 0.1%
0.7, 7, 0.1%
0.8, 25, 0.5%
0.9, 39, 0.8%
0.95, 4, 0.1%

Value, Count, Frequency (%)
65.8, 1, < 0.1%
31.6, 2, < 0.1%
26.05, 2, < 0.1%
23.5, 1, < 0.1%
22.6, 1, < 0.1%

chlorides
Real number (ℝ≥0)

Distinct: 160
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.04577235606
Minimum: 0.009
Maximum: 0.346
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.4 KiB

Quantile statistics

Minimum: 0.009
5-th percentile: 0.027
Q1: 0.036
median: 0.043
Q3: 0.05
95-th percentile: 0.067
Maximum: 0.346
Range: 0.337
Interquartile range (IQR): 0.014

Descriptive statistics

Standard deviation: 0.02184796809
Coefficient of variation (CV): 0.4773179703
Kurtosis: 37.56459971
Mean: 0.04577235606
Median Absolute Deviation (MAD): 0.007
Skewness: 5.023330683
Sum: 224.193
Variance: 0.0004773337098
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value, Count, Frequency (%)
0.044, 201, 4.1%
0.036, 200, 4.1%
0.042, 184, 3.8%
0.04, 182, 3.7%
0.046, 181, 3.7%
0.048, 174, 3.6%
0.047, 171, 3.5%
0.045, 170, 3.5%
0.05, 170, 3.5%
0.034, 168, 3.4%
Other values (150), 3097, 63.2%

Value, Count, Frequency (%)
0.009, 1, < 0.1%
0.012, 1, < 0.1%
0.013, 1, < 0.1%
0.014, 4, 0.1%
0.015, 4, 0.1%

Value, Count, Frequency (%)
0.346, 1, < 0.1%
0.301, 1, < 0.1%
0.29, 1, < 0.1%
0.271, 1, < 0.1%
0.255, 1, < 0.1%

free sulfur dioxide
Real number (ℝ≥0)

Distinct: 132
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.30808493
Minimum: 2
Maximum: 289
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.4 KiB

Quantile statistics

Minimum: 2
5-th percentile: 11
Q1: 23
median: 34
Q3: 46
95-th percentile: 63
Maximum: 289
Range: 287
Interquartile range (IQR): 23

Descriptive statistics

Standard deviation: 17.00713733
Coefficient of variation (CV): 0.4816782716
Kurtosis: 11.46634243
Mean: 35.30808493
Median Absolute Deviation (MAD): 11
Skewness: 1.406744921
Sum: 172939
Variance: 289.24272
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value, Count, Frequency (%)
29, 160, 3.3%
31, 132, 2.7%
26, 129, 2.6%
35, 129, 2.6%
34, 128, 2.6%
36, 127, 2.6%
24, 118, 2.4%
28, 112, 2.3%
33, 112, 2.3%
25, 111, 2.3%
Other values (122), 3640, 74.3%

Value, Count, Frequency (%)
2, 1, < 0.1%
3, 10, 0.2%
4, 11, 0.2%
5, 25, 0.5%
6, 32, 0.7%

Value, Count, Frequency (%)
289, 1, < 0.1%
146.5, 1, < 0.1%
138.5, 1, < 0.1%
131, 1, < 0.1%
128, 1, < 0.1%

total sulfur dioxide
Real number (ℝ≥0)

Distinct: 251
Distinct (%): 5.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 138.3606574
Minimum: 9
Maximum: 440
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.4 KiB

Quantile statistics

Minimum: 9
5-th percentile: 75
Q1: 108
median: 134
Q3: 167
95-th percentile: 212
Maximum: 440
Range: 431
Interquartile range (IQR): 59

Descriptive statistics

Standard deviation: 42.49806455
Coefficient of variation (CV): 0.3071542543
Kurtosis: 0.5718532334
Mean: 138.3606574
Median Absolute Deviation (MAD): 29
Skewness: 0.3907098417
Sum: 677690.5
Variance: 1806.085491
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value, Count, Frequency (%)
111, 69, 1.4%
113, 61, 1.2%
117, 57, 1.2%
118, 55, 1.1%
114, 54, 1.1%
122, 54, 1.1%
150, 54, 1.1%
128, 54, 1.1%
124, 53, 1.1%
140, 52, 1.1%
Other values (241), 4335, 88.5%

Value, Count, Frequency (%)
9, 1, < 0.1%
10, 1, < 0.1%
18, 2, < 0.1%
19, 1, < 0.1%
21, 1, < 0.1%

Value, Count, Frequency (%)
440, 1, < 0.1%
366.5, 1, < 0.1%
344, 1, < 0.1%
313, 1, < 0.1%
307.5, 1, < 0.1%

density
Real number (ℝ≥0)

Distinct: 890
Distinct (%): 18.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9940273765
Minimum: 0.98711
Maximum: 1.03898
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.4 KiB

Quantile statistics

Minimum: 0.98711
5-th percentile: 0.9896385
Q1: 0.9917225
median: 0.99374
Q3: 0.9961
95-th percentile: 0.999
Maximum: 1.03898
Range: 0.05187
Interquartile range (IQR): 0.0043775

Descriptive statistics

Standard deviation: 0.002990906917
Coefficient of variation (CV): 0.003008877811
Kurtosis: 9.793806911
Mean: 0.9940273765
Median Absolute Deviation (MAD): 0.00214
Skewness: 0.9777730049
Sum: 4868.74609
Variance: 8.945524186 × 10⁻⁶
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value, Count, Frequency (%)
0.992, 64, 1.3%
0.9928, 61, 1.2%
0.9932, 53, 1.1%
0.993, 52, 1.1%
0.9934, 50, 1.0%
0.9938, 49, 1.0%
0.9927, 47, 1.0%
0.9944, 46, 0.9%
0.9948, 45, 0.9%
0.9954, 44, 0.9%
Other values (880), 4387, 89.6%

Value, Count, Frequency (%)
0.98711, 1, < 0.1%
0.98713, 1, < 0.1%
0.98722, 1, < 0.1%
0.9874, 1, < 0.1%
0.98742, 2, < 0.1%

Value, Count, Frequency (%)
1.03898, 1, < 0.1%
1.0103, 2, < 0.1%
1.00295, 2, < 0.1%
1.00241, 1, < 0.1%
1.0024, 1, < 0.1%

pH
Real number (ℝ≥0)

Distinct: 103
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.188266639
Minimum: 2.72
Maximum: 3.82
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.4 KiB

Quantile statistics

Minimum: 2.72
5-th percentile: 2.96
Q1: 3.09
median: 3.18
Q3: 3.28
95-th percentile: 3.46
Maximum: 3.82
Range: 1.1
Interquartile range (IQR): 0.19

Descriptive statistics

Standard deviation: 0.1510005996
Coefficient of variation (CV): 0.04736134605
Kurtosis: 0.5307749515
Mean: 3.188266639
Median Absolute Deviation (MAD): 0.1
Skewness: 0.4577825459
Sum: 15616.13
Variance: 0.02280118108
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value, Count, Frequency (%)
3.14, 172, 3.5%
3.16, 164, 3.3%
3.22, 146, 3.0%
3.19, 145, 3.0%
3.18, 138, 2.8%
3.2, 137, 2.8%
3.08, 136, 2.8%
3.15, 136, 2.8%
3.1, 135, 2.8%
3.12, 134, 2.7%
Other values (93), 3455, 70.5%

Value, Count, Frequency (%)
2.72, 1, < 0.1%
2.74, 1, < 0.1%
2.77, 1, < 0.1%
2.79, 3, 0.1%
2.8, 3, 0.1%

Value, Count, Frequency (%)
3.82, 1, < 0.1%
3.81, 1, < 0.1%
3.8, 2, < 0.1%
3.79, 1, < 0.1%
3.77, 2, < 0.1%

sulphates
Real number (ℝ≥0)

Distinct: 79
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4898468763
Minimum: 0.22
Maximum: 1.08
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.4 KiB

Quantile statistics

Minimum: 0.22
5-th percentile: 0.34
Q1: 0.41
median: 0.47
Q3: 0.55
95-th percentile: 0.71
Maximum: 1.08
Range: 0.86
Interquartile range (IQR): 0.14

Descriptive statistics

Standard deviation: 0.1141258339
Coefficient of variation (CV): 0.2329826717
Kurtosis: 1.59092963
Mean: 0.4898468763
Median Absolute Deviation (MAD): 0.07
Skewness: 0.9771936833
Sum: 2399.27
Variance: 0.01302470597
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value, Count, Frequency (%)
0.5, 249, 5.1%
0.46, 225, 4.6%
0.44, 216, 4.4%
0.38, 214, 4.4%
0.42, 181, 3.7%
0.48, 179, 3.7%
0.45, 178, 3.6%
0.47, 172, 3.5%
0.4, 168, 3.4%
0.54, 167, 3.4%
Other values (69), 2949, 60.2%

Value, Count, Frequency (%)
0.22, 1, < 0.1%
0.23, 1, < 0.1%
0.25, 4, 0.1%
0.26, 4, 0.1%
0.27, 13, 0.3%

Value, Count, Frequency (%)
1.08, 1, < 0.1%
1.06, 1, < 0.1%
1.01, 1, < 0.1%
1, 1, < 0.1%
0.99, 1, < 0.1%

alcohol
Real number (ℝ)

Distinct: 145
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1030780645
Minimum: -0.148345256
Maximum: 0.142
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.4 KiB

Quantile statistics

Minimum: -0.148345256
5-th percentile: 0.088
Q1: 0.094
median: 0.104
Q3: 0.114
95-th percentile: 0.127
Maximum: 0.142
Range: 0.290345256
Interquartile range (IQR): 0.02

Descriptive statistics

Standard deviation: 0.02473832869
Coefficient of variation (CV): 0.2399960535
Kurtosis: 60.50760325
Mean: 0.1030780645
Median Absolute Deviation (MAD): 0.01
Skewness: -6.747872236
Sum: 504.8763602
Variance: 0.0006119849066
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value, Count, Frequency (%)
0.094, 228, 4.7%
0.095, 227, 4.6%
0.092, 197, 4.0%
0.09, 184, 3.8%
0.1, 162, 3.3%
0.11, 158, 3.2%
0.105, 156, 3.2%
0.104, 152, 3.1%
0.091, 142, 2.9%
0.108, 135, 2.8%
Other values (135), 3157, 64.5%

Value, Count, Frequency (%)
-0.148345256, 1, < 0.1%
-0.1475847879, 1, < 0.1%
-0.145126027, 1, < 0.1%
-0.1448445225, 1, < 0.1%
-0.1437677749, 1, < 0.1%

Value, Count, Frequency (%)
0.142, 1, < 0.1%
0.1405, 1, < 0.1%
0.14, 5, 0.1%
0.139, 3, 0.1%
0.138, 2, < 0.1%

success
Real number (ℝ≥0)

Distinct: 88
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 55.07268273
Minimum: 10
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.4 KiB

Quantile statistics

Minimum: 10
5-th percentile: 12
Q1: 19
median: 60
Q3: 83
95-th percentile: 89
Maximum: 99
Range: 89
Interquartile range (IQR): 64

Descriptive statistics

Standard deviation: 28.76398764
Coefficient of variation (CV): 0.5222913832
Kurtosis: -1.442068227
Mean: 55.07268273
Median Absolute Deviation (MAD): 24
Skewness: -0.3506963182
Sum: 269746
Variance: 827.3669849
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value, Count, Frequency (%)
83, 193, 3.9%
88, 182, 3.7%
82, 181, 3.7%
86, 175, 3.6%
84, 172, 3.5%
19, 167, 3.4%
89, 166, 3.4%
81, 165, 3.4%
87, 162, 3.3%
85, 160, 3.3%
Other values (78), 3175, 64.8%

Value, Count, Frequency (%)
10, 58, 1.2%
11, 138, 2.8%
12, 139, 2.8%
13, 133, 2.7%
14, 141, 2.9%

Value, Count, Frequency (%)
99, 1, < 0.1%
96, 1, < 0.1%
95, 1, < 0.1%
94, 4, 0.1%
93, 1, < 0.1%

pricing
Categorical

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.4 KiB
Budget: 2223
Medium: 1792
Expensive: 883

Length

Max length: 9
Median length: 6
Mean length: 6.540832993
Min length: 6
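
The length statistics for a categorical variable can be recovered from the category counts alone. A small sketch, using the pricing counts reported in this section (the variable names are illustrative); the same weighted sum also gives the total character count reported for this variable.

```python
# Category counts copied from the pricing variable in this report.
counts = {"Budget": 2223, "Medium": 1792, "Expensive": 883}

n = sum(counts.values())

# Each occurrence of a label contributes len(label) characters.
total_characters = sum(len(label) * c for label, c in counts.items())

mean_length = total_characters / n
min_length = min(len(label) for label in counts)
max_length = max(len(label) for label in counts)

print(total_characters, round(mean_length, 6), min_length, max_length)
```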

Characters and Unicode

Total characters: 32037
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Budget
2nd row: Expensive
3rd row: Medium
4th row: Medium
5th row: Expensive

Value, Count, Frequency (%)
Budget, 2223, 45.4%
Medium, 1792, 36.6%
Expensive, 883, 18.0%

Histogram of lengths of the category

Value, Count, Frequency (%)
budget, 2223, 45.4%
medium, 1792, 36.6%
expensive, 883, 18.0%

Most occurring characters

Value, Count, Frequency (%)
e, 5781, 18.0%
u, 4015, 12.5%
d, 4015, 12.5%
i, 2675, 8.3%
B, 2223, 6.9%
g, 2223, 6.9%
t, 2223, 6.9%
M, 1792, 5.6%
m, 1792, 5.6%
E, 883, 2.8%
Other values (5), 4415, 13.8%

Most occurring categories

Value, Count, Frequency (%)
Lowercase Letter, 27139, 84.7%
Uppercase Letter, 4898, 15.3%

Most frequent character per category

Value, Count, Frequency (%)
e, 5781, 21.3%
u, 4015, 14.8%
d, 4015, 14.8%
i, 2675, 9.9%
g, 2223, 8.2%
t, 2223, 8.2%
m, 1792, 6.6%
x, 883, 3.3%
p, 883, 3.3%
n, 883, 3.3%
Other values (2), 1766, 6.5%

Value, Count, Frequency (%)
B, 2223, 45.4%
M, 1792, 36.6%
E, 883, 18.0%

Most occurring scripts

Value, Count, Frequency (%)
Latin, 32037, 100.0%

Most frequent character per script

Value, Count, Frequency (%)
e, 5781, 18.0%
u, 4015, 12.5%
d, 4015, 12.5%
i, 2675, 8.3%
B, 2223, 6.9%
g, 2223, 6.9%
t, 2223, 6.9%
M, 1792, 5.6%
m, 1792, 5.6%
E, 883, 2.8%
Other values (5), 4415, 13.8%

Most occurring blocks

Value, Count, Frequency (%)
ASCII, 32037, 100.0%

Most frequent character per block

Value, Count, Frequency (%)
e, 5781, 18.0%
u, 4015, 12.5%
d, 4015, 12.5%
i, 2675, 8.3%
B, 2223, 6.9%
g, 2223, 6.9%
t, 2223, 6.9%
M, 1792, 5.6%
m, 1792, 5.6%
E, 883, 2.8%
Other values (5), 4415, 13.8%

country
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.4 KiB
Spain: 2489
Italy: 2409

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 24490
Distinct characters: 9
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Italy
2nd row: Spain
3rd row: Italy
4th row: Spain
5th row: Spain

Value, Count, Frequency (%)
Spain, 2489, 50.8%
Italy, 2409, 49.2%

Histogram of lengths of the category

Value, Count, Frequency (%)
spain, 2489, 50.8%
italy, 2409, 49.2%

Most occurring characters

Value, Count, Frequency (%)
a, 4898, 20.0%
S, 2489, 10.2%
p, 2489, 10.2%
i, 2489, 10.2%
n, 2489, 10.2%
I, 2409, 9.8%
t, 2409, 9.8%
l, 2409, 9.8%
y, 2409, 9.8%

Most occurring categories

Value, Count, Frequency (%)
Lowercase Letter, 19592, 80.0%
Uppercase Letter, 4898, 20.0%

Most frequent character per category

Value, Count, Frequency (%)
a, 4898, 25.0%
p, 2489, 12.7%
i, 2489, 12.7%
n, 2489, 12.7%
t, 2409, 12.3%
l, 2409, 12.3%
y, 2409, 12.3%

Value, Count, Frequency (%)
S, 2489, 50.8%
I, 2409, 49.2%

Most occurring scripts

Value, Count, Frequency (%)
Latin, 24490, 100.0%

Most frequent character per script

Value, Count, Frequency (%)
a, 4898, 20.0%
S, 2489, 10.2%
p, 2489, 10.2%
i, 2489, 10.2%
n, 2489, 10.2%
I, 2409, 9.8%
t, 2409, 9.8%
l, 2409, 9.8%
y, 2409, 9.8%

Most occurring blocks

Value, Count, Frequency (%)
ASCII, 24490, 100.0%

Most frequent character per block

Value, Count, Frequency (%)
a, 4898, 20.0%
S, 2489, 10.2%
p, 2489, 10.2%
i, 2489, 10.2%
n, 2489, 10.2%
I, 2409, 9.8%
t, 2409, 9.8%
l, 2409, 9.8%
y, 2409, 9.8%

Interactions

[132 interaction plots between pairs of variables; figures not reproduced]

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
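
The calculation described above can be sketched in a few lines of plain Python. This is an illustrative implementation of the stated formula, not the code used to produce this report; the function name and the toy data are assumptions.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

# Toy data (hypothetical, not taken from the dataset).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

r = pearson_r(x, y)
# Shifting and rescaling x by a positive factor leaves r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], y)
print(r, r_rescaled)
```

The population form (dividing by n) is used for both the covariance and the standard deviations; the sample form (dividing by n - 1) gives the same r because the factors cancel.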

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
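
As a sketch of the procedure just described, the code below ranks both variables (tied values share the mean of their ranks, which is an assumption about tie handling) and then applies the covariance-over-standard-deviations formula to the ranks. The function names and toy data are illustrative only.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Toy data (hypothetical): y is a monotonic but nonlinear function of x.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print(spearman_rho(x, y), pearson_r(x, y))
```

On this toy data the rank correlation is perfect while Pearson's r is below 1, which illustrates why ρ is better at catching nonlinear monotonic relationships.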

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
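
The pair-counting formula above can be sketched directly. Note that this is the simple variant often called tau-a, which matches the description but makes no correction for ties; libraries frequently use a tie-corrected variant, so results can differ on data with repeated values. The function name and the toy data are illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # Pairs tied in x or y count as neither.
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Toy data (hypothetical): one swapped pair out of three.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```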

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
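
A minimal sketch of a bias-corrected Cramér's V computed from a contingency table follows. It implements the commonly cited form of Bergsma's correction and is not taken from the profiling library's source; the function name and the example table are hypothetical, and every row and column total is assumed to be positive.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows)."""
    r, k = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Hypothetical 2x2 table of counts for two nominal variables.
print(cramers_v_corrected([[30, 10], [10, 30]]))
```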

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

index, fixed acidity, volatile acidity, citric acid, residual sugar, chlorides, free sulfur dioxide, total sulfur dioxide, density, pH, sulphates, alcohol, success, pricing, country
0, 7.0, 0.27, 0.36, 20.7, 0.045, 45.0, 170.0, 1.0010, 3.00, 0.45, 0.088, 84.0, Budget, Italy
1, 6.3, 0.30, 0.34, 1.6, 0.049, 14.0, 132.0, 0.9940, 3.30, 0.49, 0.095, 12.0, Expensive, Spain
2, 8.1, 0.28, 0.40, 6.9, 0.050, 30.0, 97.0, 0.9951, 3.26, 0.44, 0.101, 54.0, Medium, Italy
3, 7.2, 0.23, 0.32, 8.5, 0.058, 47.0, 186.0, 0.9956, 3.19, 0.40, 0.099, 88.0, Medium, Spain
4, 7.2, 0.23, 0.32, 8.5, 0.058, 47.0, 186.0, 0.9956, 3.19, 0.40, 0.099, 80.0, Expensive, Spain
5, 8.1, 0.28, 0.40, 6.9, 0.050, 30.0, 97.0, 0.9951, 3.26, 0.44, 0.101, 60.0, Budget, Italy
6, 6.2, 0.32, 0.16, 7.0, 0.045, 30.0, 136.0, 0.9949, 3.18, 0.47, 0.096, 56.0, Budget, Spain
7, 7.0, 0.27, 0.36, 20.7, 0.045, 45.0, 170.0, 1.0010, 3.00, 0.45, 0.088, 85.0, Budget, Italy
8, 6.3, 0.30, 0.34, 1.6, 0.049, 14.0, 132.0, 0.9940, 3.30, 0.49, 0.095, 17.0, Medium, Spain
9, 8.1, 0.22, 0.43, 1.5, 0.044, 28.0, 129.0, 0.9938, 3.22, 0.45, 0.110, 12.0, Budget, Italy

Last rows

index, fixed acidity, volatile acidity, citric acid, residual sugar, chlorides, free sulfur dioxide, total sulfur dioxide, density, pH, sulphates, alcohol, success, pricing, country
4888, 6.8, 0.220, 0.36, 1.20, 0.052, 38.0, 127.0, 0.99330, 3.04, 0.54, 0.092, 12.0, Expensive, Spain
4889, 4.9, 0.235, 0.27, 11.75, 0.030, 34.0, 118.0, 0.99540, 3.07, 0.50, 0.094, 85.0, Medium, Spain
4890, 6.1, 0.340, 0.29, 2.20, 0.036, 25.0, 100.0, 0.98938, 3.06, 0.44, 0.118, 15.0, Budget, Spain
4891, 5.7, 0.210, 0.32, 0.90, 0.038, 38.0, 121.0, 0.99074, 3.24, 0.46, 0.106, 64.0, Budget, Italy
4892, 6.5, 0.230, 0.38, 1.30, 0.032, 29.0, 112.0, 0.99298, 3.29, 0.54, 0.097, 18.0, Expensive, Italy
4893, 6.2, 0.210, 0.29, 1.60, 0.039, 24.0, 92.0, 0.99114, 3.27, 0.50, 0.112, 19.0, Budget, Spain
4894, 6.6, 0.320, 0.36, 8.00, 0.047, 57.0, 168.0, 0.99490, 3.15, 0.46, 0.096, 89.0, Medium, Spain
4895, 6.5, 0.240, 0.19, 1.20, 0.041, 30.0, 111.0, 0.99254, 2.99, 0.46, 0.094, 11.0, Budget, Italy
4896, 5.5, 0.290, 0.30, 1.10, 0.022, 20.0, 110.0, 0.98869, 3.34, 0.38, 0.128, 16.0, Budget, Spain
4897, 6.0, 0.210, 0.38, 0.80, 0.020, 22.0, 98.0, 0.98941, 3.26, 0.32, 0.118, 12.0, Medium, Spain

Duplicate rows

Most frequent

index, fixed acidity, volatile acidity, citric acid, residual sugar, chlorides, free sulfur dioxide, total sulfur dioxide, density, pH, sulphates, alcohol, success, pricing, country, count
5, 7.0, 0.15, 0.28, 14.7, 0.051, 29.0, 149.0, 0.99792, 2.96, 0.39, 0.090, 83.0, Budget, Spain, 3
0, 6.4, 0.22, 0.34, 1.4, 0.023, 56.0, 115.0, 0.98958, 3.18, 0.70, 0.117, 15.0, Medium, Spain, 2
1, 6.6, 0.39, 0.39, 11.9, 0.057, 51.0, 221.0, 0.99851, 3.26, 0.51, 0.089, 81.0, Budget, Italy, 2
2, 6.8, 0.18, 0.30, 12.8, 0.062, 19.0, 171.0, 0.99808, 3.00, 0.52, 0.090, 86.0, Medium, Italy, 2
3, 6.9, 0.40, 0.17, 12.9, 0.033, 59.0, 186.0, 0.99754, 3.08, 0.49, 0.094, 84.0, Budget, Spain, 2
4, 6.9, 0.40, 0.17, 12.9, 0.033, 59.0, 186.0, 0.99754, 3.08, 0.49, 0.094, 87.0, Expensive, Italy, 2
6, 7.0, 0.31, 0.26, 7.4, 0.069, 28.0, 160.0, 0.99540, 3.13, 0.46, 0.098, 83.0, Budget, Italy, 2
7, 7.0, 0.44, 0.24, 12.1, 0.056, 68.0, 210.0, 0.99718, 3.05, 0.50, 0.095, 81.0, Budget, Italy, 2
8, 7.2, 0.23, 0.19, 13.7, 0.052, 47.0, 197.0, 0.99865, 3.12, 0.53, 0.090, 48.0, Medium, Spain, 2
9, 7.4, 0.16, 0.30, 13.7, 0.056, 33.0, 168.0, 0.99825, 2.90, 0.44, 0.087, 80.0, Medium, Italy, 2
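
A table like the one above, listing each duplicated row with its number of occurrences and sorted by frequency, can be built by counting identical rows. The sketch below uses a hypothetical miniature dataset with four columns rather than the full 14, and is not the profiling library's own implementation.

```python
from collections import Counter

# Hypothetical miniature dataset; each row is a tuple of column values.
rows = [
    (7.0, 0.15, "Budget", "Spain"),
    (6.4, 0.22, "Medium", "Spain"),
    (7.0, 0.15, "Budget", "Spain"),
    (6.4, 0.22, "Medium", "Spain"),
    (7.0, 0.15, "Budget", "Spain"),
    (8.1, 0.28, "Medium", "Italy"),
]

# Count how many times each distinct row occurs.
counts = Counter(rows)

# Keep rows that occur more than once, most frequent first.
duplicates = sorted(
    ((row, c) for row, c in counts.items() if c > 1),
    key=lambda item: item[1],
    reverse=True,
)

for row, c in duplicates:
    print(row, c)
```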